Back to Machine Learning
It’s been a while since I checked in with machine learning tools, so here’s a post which is mostly me getting back into it. This is more or less from scratch, as I haven’t done any machine learning work on my current laptop, much less any work on PopOS in Python beyond a few simple scripts. So if you have Python and Pip installed, and you are on PopOS or maybe an Ubuntu-based distro, this is sort of a walkthrough of how things get set up.
This is the method of setting up these tools without Anaconda. While Anaconda seems great for beginners in Python, I’ve seen it cause all sorts of dependency issues when running in parallel with standard Python installs.
First make sure your Pip is up to date and you are not in some Python virtual environment:
gatewaynode@pop-os:~$ pip install --upgrade pip
Collecting pip
Downloading pip-21.0.1-py3-none-any.whl (1.5 MB)
|████████████████████████████████| 1.5 MB 4.5 MB/s
Installing collected packages: pip
Successfully installed pip-21.0.1
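For the "not in some Python virtual environment" part, a quick check from Python itself can help. This is a minimal sketch, assuming a standard `venv` or a recent `virtualenv`; the function name `in_virtualenv` is my own, not something from pip or Python.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If this prints `True`, deactivate the environment before running the installs here, otherwise the packages land inside that environment instead of your user or system install.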
Then install TensorFlow with pip. I seem to remember these packages need to be installed widely enough for Jupyter notebooks to use, so at least as your user, possibly as system-wide installs. (For any of you poor fools following this as a tutorial of sorts, notebooks are Python in an interactive, browser-based console.)
gatewaynode@pop-os:~$ pip install tensorflow
Collecting tensorflow
Downloading tensorflow-2.4.1-cp38-cp38-manylinux2010_x86_64.whl (394.4 MB)
...
<snip a lot of dependencies and such />
...
Successfully installed absl-py-0.11.0 astunparse-1.6.3 cachetools-4.2.1 flatbuffers-1.12 gast-0.3.3 google-auth-1.26.1 google-auth-oauthlib-0.4.2 google-pasta-0.2.0 grpcio-1.32.0 h5py-2.10.0 keras-preprocessing-1.1.2 markdown-3.3.3 numpy-1.19.5 opt-einsum-3.3.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 requests-oauthlib-1.3.0 rsa-4.7 tensorboard-2.4.1 tensorboard-plugin-wit-1.8.0 tensorflow-2.4.1 tensorflow-estimator-2.4.0 termcolor-1.1.0 typing-extensions-3.7.4.3 werkzeug-1.0.1 wheel-0.36.2 wrapt-1.12.1
We’ll still most likely need a few more dependencies for the Notebooks to do some of their fancy-fancy.
gatewaynode@pop-os:~$ pip install scipy
Collecting scipy
Downloading scipy-1.6.0-cp38-cp38-manylinux1_x86_64.whl (27.2 MB)
...
And we’ll need the Notebook server itself (runs locally to serve the notebook to your browser).
gatewaynode@pop-os:~$ pip install jupyterlab
Collecting jupyterlab
Downloading jupyterlab-3.0.7-py3-none-any.whl (8.3 MB)
...
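With the three installs done, it may be worth confirming what the interpreter can actually see. The sketch below uses the standard library's `importlib.metadata` (Python 3.8 and later) to look up installed versions without importing the packages; the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not visible to this interpreter (wrong environment or not installed).
            versions[name] = None
    return versions


if __name__ == "__main__":
    wanted = ["tensorflow", "scipy", "jupyterlab"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the same `python` that Jupyter will use; a `not installed` line usually means the package went into a different environment than the one you are checking.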
I like to review the dependencies so I at least have a small chance of knowing what is installed on my system. Something that stands out for me in the jupyterlab install is prometheus-client-0.9.0, which worries me a bit. Prometheus is a monitoring system built around time-series metrics, and there are security implications to its client being set up by default. I’d be more worried if the server were installed, but this is concerning as well. I hope it doesn’t set up a default connection, and that the open source models I want to try can’t use it to reach some random Prometheus server in Seychelles.
Launch the notebook server to make sure everything is working:
gatewaynode@pop-os:~$ jupyter-lab
...
Since I’m getting back into machine learning from a security perspective, a few things in the startup log stand out that might be useful if any future vectors present themselves.
[I 2021-02-13 23:34:53.037 ServerApp] Writing notebook server cookie secret to /home/gatewaynode/.local/share/jupyter/runtime/jupyter_cookie_secret
...
[I 2021-02-13 23:34:53.048 ServerApp] http://localhost:8888/lab?token=3f3205496238dcc675a68eee723bd59b572c6c69d616be62
...
To access the server, open this file in a browser:
file:///home/gatewaynode/.local/share/jupyter/runtime/jpserver-5822-open.html
As with the "cookie secret" path, I’m not sure why this little detail is shared with the notebook user. But if I were trying to hack data scientists who I know are running a particular local web server, this would be helpful. Now, I’m not actually trying to hack data scientists, but who knows, maybe my corp will want to red team this vector some day.
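Since those runtime files hold a secret and a token, a first defensive check is whether anyone besides the owner can touch them. Here is a small sketch that flags files with group or other permission bits set; the directory is taken from the log lines above and is an assumption about where your install writes, and both function names are hypothetical.

```python
import stat
from pathlib import Path


def accessible_by_others(path):
    """Return True if group or other users have any permission bits on `path`."""
    mode = Path(path).stat().st_mode
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


def audit_runtime_dir(runtime_dir):
    """Yield (path, octal mode string) for files in `runtime_dir` that are not owner-only."""
    for entry in sorted(Path(runtime_dir).iterdir()):
        if entry.is_file() and accessible_by_others(entry):
            yield entry, oct(stat.S_IMODE(entry.stat().st_mode))


if __name__ == "__main__":
    # Assumed location, copied from the log output above; adjust for your system.
    runtime = Path.home() / ".local/share/jupyter/runtime"
    if runtime.is_dir():
        for path, mode in audit_runtime_dir(runtime):
            print(f"{mode} {path}")
```

No output means every file in the directory is owner-only, which is what you would hope for with a cookie secret.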
Now I can start playing with tensors interactively. More specifically, I can start breaking them, documenting failure modes, and building a fuzzer.
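As a starting point for that fuzzer, here is a very small harness sketch. It is not tied to TensorFlow: it generates random pairs of shapes, hands them to whatever callable you give it, and tallies the exception types that come back. The names `random_shape` and `fuzz_shapes`, and the choice to fuzz shapes at all, are my own assumptions for illustration.

```python
import random
from collections import Counter


def random_shape(rng, max_rank=4, max_dim=5):
    """Return a random shape tuple; zero-sized dimensions are allowed on purpose."""
    rank = rng.randint(0, max_rank)
    return tuple(rng.randint(0, max_dim) for _ in range(rank))


def fuzz_shapes(target, trials=100, seed=0):
    """Call target(shape_a, shape_b) with random shapes and tally exception types raised."""
    rng = random.Random(seed)  # seeded so a failing run can be reproduced
    failures = Counter()
    for _ in range(trials):
        a, b = random_shape(rng), random_shape(rng)
        try:
            target(a, b)
        except Exception as exc:  # broad on purpose: the goal is to catalogue failure modes
            failures[type(exc).__name__] += 1
    return failures


if __name__ == "__main__":
    def picky(a, b):
        # Stand-in target: complains whenever the two shapes differ.
        if a != b:
            raise ValueError(f"shape mismatch: {a} vs {b}")

    print(fuzz_shapes(picky, trials=20, seed=1))
```

In a notebook with TensorFlow imported as `tf`, a target could be something like `lambda a, b: tf.zeros(a) + tf.zeros(b)`, so that incompatible shapes surface as whatever errors the library raises; treat that as an untested suggestion, not a finished test plan.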